visual localization
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Europe > Serbia > Vojvodina > South Bačka District > Novi Sad (0.04)
- Europe > Netherlands (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > Switzerland (0.04)
- Europe > Belgium (0.04)
- Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- Asia > Taiwan > Taiwan Province > Taipei (0.04)
- Transportation > Ground > Road (0.46)
- Information Technology > Services (0.46)
NeRF-IBVS: Visual Servo Based on NeRF for Visual Localization and Navigation
Visual localization is a fundamental task in computer vision and robotics. Existing visual localization methods require a large number of posed images for training in order to generalize to novel views, and state-of-the-art methods generally require dense ground-truth 3D labels for supervision. However, acquiring a large number of posed images and dense 3D labels in the real world is challenging and costly. In this paper, we present a novel visual localization method that achieves accurate localization using only a few posed images, far fewer than other localization methods require. To achieve this, we first use a few posed images with coarse pseudo-3D labels provided by NeRF to train a coordinate regression network.
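The abstract describes the pseudo-labeling step only at a high level. As a purely illustrative sketch (not the authors' code), the snippet below shows one way depth rendered by a trained NeRF can be back-projected through a known camera pose into world-space pseudo-3D labels and used to supervise a per-pixel coordinate regression network. The tiny convolutional regressor, tensor shapes, and random stand-in data are assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def backproject(depth, K, c2w):
    """Lift a depth map to world-space 3D points (pseudo-3D labels).

    depth: (H, W) depth rendered by a trained NeRF for a posed image.
    K:     (3, 3) camera intrinsics.
    c2w:   (4, 4) camera-to-world pose of that image.
    """
    H, W = depth.shape
    v, u = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    pix = torch.stack([u, v, torch.ones_like(u)], dim=-1).float()    # (H, W, 3) homogeneous pixels
    rays_cam = pix @ torch.linalg.inv(K).T                           # pixel rays in camera frame
    pts_cam = rays_cam * depth.unsqueeze(-1)                         # scale rays by rendered depth
    pts_world = pts_cam @ c2w[:3, :3].T + c2w[:3, 3]                 # rigid transform to world frame
    return pts_world                                                 # (H, W, 3)

# Hypothetical per-pixel scene-coordinate regressor (stand-in architecture).
coord_net = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 1),  # predicts (x, y, z) per pixel
)
optimizer = torch.optim.Adam(coord_net.parameters(), lr=1e-4)

# One training step on a single posed image (random tensors stand in for real data).
rgb   = torch.rand(1, 3, 64, 64)
depth = torch.rand(64, 64) * 5.0
K     = torch.tensor([[60.0, 0, 32], [0, 60.0, 32], [0, 0, 1.0]])
c2w   = torch.eye(4)

pseudo_xyz = backproject(depth, K, c2w).permute(2, 0, 1).unsqueeze(0)  # (1, 3, H, W) labels
optimizer.zero_grad()
loss = F.l1_loss(coord_net(rgb), pseudo_xyz)
loss.backward()
optimizer.step()
```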
- Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.05)
- South America > Brazil > São Paulo (0.04)
- North America > United States > New York (0.04)
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > Canada (0.04)
- Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Sensing and Signal Processing > Image Processing (0.95)
- Information Technology > Artificial Intelligence > Vision (0.95)
Aerial-ground Cross-modal Localization: Dataset, Ground-truth, and Benchmark
Yang, Yandi, Li, Jianping, Liao, Youqi, Li, Yuhao, Zhang, Yizhe, Dong, Zhen, Yang, Bisheng, El-Sheimy, Naser
Accurate visual localization in dense urban environments is a fundamental task in photogrammetry, geospatial information science, and robotics. While imagery is a low-cost and widely accessible sensing modality, its effectiveness for visual odometry is often limited by textureless surfaces, severe viewpoint changes, and long-term drift. The growing public availability of airborne laser scanning (ALS) data opens new avenues for scalable and precise visual localization by leveraging ALS as a prior map. However, the potential of ALS-based localization remains underexplored due to three key limitations: (1) the lack of platform-diverse datasets, (2) the absence of reliable ground-truth generation methods applicable to large-scale urban environments, and (3) limited validation of existing Image-to-Point Cloud (I2P) algorithms under aerial-ground cross-platform settings. To overcome these challenges, we introduce a new large-scale dataset that integrates ground-level imagery from mobile mapping systems with ALS point clouds collected in Wuhan, Hong Kong, and San Francisco.
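As a rough illustration of what using ALS as a prior map for image-to-point-cloud localization involves, the sketch below projects georeferenced ALS points into a posed camera to obtain 2D-3D correspondences; a projection of this kind underlies both ground-truth generation and the evaluation of I2P methods. The intrinsics, pose convention, and random point cloud are assumptions for the example, not part of the dataset.

```python
import numpy as np

def project_als_to_image(points_world, K, T_cam_world, image_size):
    """Project georeferenced ALS points into a camera to form 2D-3D pairs.

    points_world: (N, 3) ALS points in the world/map frame.
    K:            (3, 3) camera intrinsics.
    T_cam_world:  (4, 4) world-to-camera transform (the pose being evaluated).
    image_size:   (width, height) in pixels.
    """
    # Transform into the camera frame and keep points in front of the camera.
    pts_h = np.hstack([points_world, np.ones((len(points_world), 1))])
    pts_cam = (T_cam_world @ pts_h.T).T[:, :3]
    in_front = pts_cam[:, 2] > 0.1
    pts_cam = pts_cam[in_front]

    # Perspective projection and image-bounds check.
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]
    w, h = image_size
    valid = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return uv[valid], points_world[in_front][valid]  # pixel coordinates and their 3D points

# Toy usage with random stand-in data.
rng = np.random.default_rng(0)
als = rng.uniform(-50, 50, size=(10_000, 3)) + np.array([0, 0, 60.0])  # points ahead of the camera
K = np.array([[800.0, 0, 640], [0, 800.0, 360], [0, 0, 1]])
pose = np.eye(4)                                    # camera at the origin, looking along +z
uv, xyz = project_als_to_image(als, K, pose, (1280, 720))
print(uv.shape, xyz.shape)
```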
- Asia > China > Hong Kong (0.27)
- Asia > China > Hubei Province > Wuhan (0.27)
- North America > United States > California > San Francisco County > San Francisco (0.25)
ActLoc: Learning to Localize on the Move via Active Viewpoint Selection
Li, Jiajie, Sun, Boyang, Di Giammarino, Luca, Blum, Hermann, Pollefeys, Marc
Reliable localization is critical for robot navigation, yet most existing systems implicitly assume that all viewing directions at a location are equally informative. In practice, localization becomes unreliable when the robot observes unmapped, ambiguous, or uninformative regions. To address this, we present ActLoc, an active viewpoint-aware planning framework for enhancing localization accuracy in general robot navigation tasks. At its core, ActLoc employs a large-scale trained attention-based model for viewpoint selection. The model encodes a metric map and the camera poses used during map construction, and predicts localization accuracy across yaw and pitch directions at arbitrary 3D locations. These per-point accuracy distributions are incorporated into a path planner, enabling the robot to actively select camera orientations that maximize localization robustness while respecting task and motion constraints. ActLoc achieves state-of-the-art results on single-viewpoint selection and generalizes effectively to full-trajectory planning. Its modular design makes it readily applicable to diverse robot navigation and inspection tasks.
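To make the planner interface concrete, here is a minimal sketch, with made-up data standing in for the trained attention model's output, of selecting the best camera orientation from a per-location accuracy grid over yaw and pitch, subject to a simple yaw constraint imposed by the motion planner. The bin layout and the constraint form are assumptions for illustration, not ActLoc's actual interface.

```python
import numpy as np

def select_viewpoint(accuracy_grid, yaw_bins, pitch_bins, allowed_yaw=None):
    """Pick the camera orientation with the highest predicted localization accuracy.

    accuracy_grid: (num_yaw, num_pitch) scores predicted at one 3D location
                   (here a random stand-in for the learned model's output).
    yaw_bins, pitch_bins: bin centers in radians.
    allowed_yaw:   optional (min, max) yaw interval imposed by the task/motion planner.
    """
    grid = accuracy_grid.copy()
    if allowed_yaw is not None:
        lo, hi = allowed_yaw
        mask = (yaw_bins < lo) | (yaw_bins > hi)
        grid[mask, :] = -np.inf                    # forbid orientations outside the constraint
    i, j = np.unravel_index(np.argmax(grid), grid.shape)
    return yaw_bins[i], pitch_bins[j], grid[i, j]

# Toy usage: a random grid stands in for the model's per-point accuracy prediction.
rng = np.random.default_rng(1)
yaw_bins = np.linspace(-np.pi, np.pi, 36, endpoint=False)
pitch_bins = np.linspace(-np.pi / 6, np.pi / 6, 9)
pred = rng.random((36, 9))
yaw, pitch, score = select_viewpoint(pred, yaw_bins, pitch_bins, allowed_yaw=(-np.pi / 2, np.pi / 2))
print(f"best yaw={np.degrees(yaw):.1f} deg, pitch={np.degrees(pitch):.1f} deg, score={score:.2f}")
```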
Hierarchical Image Matching for UAV Absolute Visual Localization via Semantic and Structural Constraints
Zhang, Xiangkai, Zhou, Xiang, Chen, Mao, Lu, Yuchen, Yang, Xu, Liu, Zhiyong
Absolute localization, which aims to determine an agent's location with respect to a global reference, is crucial for unmanned aerial vehicles (UAVs) in various applications, but it becomes challenging when global navigation satellite system (GNSS) signals are unavailable. Vision-based absolute localization methods, which locate the UAV's current view within a reference satellite map to estimate its position, have become popular in GNSS-denied scenarios. However, existing methods mostly rely on traditional, low-level image matching and struggle with the significant differences introduced by cross-source discrepancies and temporal variations. To overcome these limitations, this paper introduces a hierarchical cross-source image matching method designed for UAV absolute localization, which integrates a semantic-aware and structure-constrained coarse matching module with a lightweight fine-grained matching module. Specifically, in the coarse matching module, semantic features derived from a vision foundation model first establish region-level correspondences under semantic and structural constraints. The fine-grained matching module is then applied to extract fine features and establish pixel-level correspondences. Building upon this, a UAV absolute visual localization pipeline is constructed without any reliance on relative localization techniques, by placing an image retrieval module before the proposed hierarchical image matching modules. Experimental evaluations on public benchmark datasets and a newly introduced CS-UAV dataset demonstrate the superior accuracy and robustness of the proposed method under various challenging conditions, confirming its effectiveness.
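The sketch below illustrates the coarse-to-fine idea in its simplest form: a region-level search over a sparse grid of candidate placements, followed by a pixel-level refinement inside the winning region. It uses plain cosine-similarity template matching on dense descriptor maps, which is an assumption standing in for the paper's semantic- and structure-constrained coarse module and its lightweight fine module.

```python
import numpy as np

def coarse_to_fine_match(uav_desc, sat_desc, cell=8):
    """Two-stage matching sketch: region level first, then pixel level within that region.

    uav_desc: (h, w, d) dense descriptors of the UAV view (e.g. from a vision foundation model).
    sat_desc: (H, W, d) dense descriptors of the reference satellite tile, H >= h, W >= w.
    cell:     coarse grid step in descriptor pixels.
    """
    h, w, _ = uav_desc.shape
    H, W, _ = sat_desc.shape
    q = uav_desc.reshape(-1, uav_desc.shape[-1])
    q = q / (np.linalg.norm(q) + 1e-8)

    def score(y, x):
        # Cosine similarity between the query and the satellite patch at (y, x).
        patch = sat_desc[y:y + h, x:x + w].reshape(-1, sat_desc.shape[-1])
        return float((q * (patch / (np.linalg.norm(patch) + 1e-8))).sum())

    # Coarse stage: evaluate similarity on a sparse grid of candidate regions.
    ys = range(0, H - h + 1, cell)
    xs = range(0, W - w + 1, cell)
    cy, cx = max(((y, x) for y in ys for x in xs), key=lambda p: score(*p))

    # Fine stage: exhaustive pixel-level search inside the winning region's neighborhood.
    fy = range(max(0, cy - cell), min(H - h, cy + cell) + 1)
    fx = range(max(0, cx - cell), min(W - w, cx + cell) + 1)
    return max(((y, x) for y in fy for x in fx), key=lambda p: score(*p))

# Toy usage: plant a noisy copy of the UAV patch inside a larger "satellite" descriptor map.
rng = np.random.default_rng(2)
sat = rng.normal(size=(64, 64, 16))
uav = sat[20:36, 30:46] + 0.05 * rng.normal(size=(16, 16, 16))
print(coarse_to_fine_match(uav, sat))   # typically recovers (20, 30)
```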
- Information Technology (0.34)
- Aerospace & Defense (0.34)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.34)